Title: RICE MLH1 ORTHOLOG AND USES THEREOF 
Inventor(s): Pramod B. Mahajan 
Application No: Not yet assigned 
AttyDktNo: 35718/238971 (5718-142) 

Complete Nucleotide and Deduced Amino Acid Sequence of Rice Homolog of MLH1 

1 CGGCACGAGATTTTGCAGTCTCCTCTCCTCCTCCGCTCGAGCGAGTGAGTCCCGACCACG 60 

61 TCGCTGCCCTCGCCTCACCGCCGGCCAACCGCCGTGACGAGAGATCGAGCAGGGCGGGGC 120 

121 ATGGACGAGCCTTCGCCGCGCGGAGGTGGGTGCGCCGGGGAGCCGCCCCGCATCCGGAGG 180 
MetAspGluProSerProArgGlyGlyGlyCysAlaGlyGluProProArgIleArgArg 

181 TTGGAGGAGTCGGTGGTGAACCGCATCGCGGCGGGGGAGGTGATCCAGCGGCCGTCGTCG 240 
LeuGluGluSerValValAsnArgIleAlaAlaGlyGluValIleGlnArgProSerSer 

241 GCGGTGAAGGAGCTCATCGAGAACAGCCTCGACGCTGGCGCCTCCAGCGTCTCCGTTGCG 300 
AlaValLysGluLeuIleGluAsnSerLeuAspAlaGlyAlaSerSerValSerValAla 

301 GTGAAGGACGGTGGCCTCAAGCTCATCCAGGTCTCCGATGACGGCCATGGCATCAGGTTT 360 
ValLysAspGlyGlyLeuLysLeuIleGlnValSerAspAspGlyHisGlyIleArgPhe 

361 GAGGATTTGGCAATATTGTGCGAAAGGCATACTACCTCAAAGTTATCTGCATACGAGGAT 420 
GluAspLeuAlaIleLeuCysGluArgHisThrThrSerLysLeuSerAlaTyrGluAsp 

421 CTGCAGACCATAAAATCGATGGGGTTCAGAGGGGAGGCTTTGGCTAGTATGACTTATGTT 480 
LeuGlnThrIleLysSerMetGlyPheArgGlyGluAlaLeuAlaSerMetThrTyrVal 

481 GGCCATGTTACCGTGACAACGATAACAGAAGGCCAATTGCACGGCTACAGGGTTTCTTAC 540 
GlyHisValThrValThrThrIleThrGluGlyGlnLeuHisGlyTyrArgValSerTyr 

541 AGAGATGGTGTAATGGAGAATGAGCCTAAGCCTTGCGCTGCGGTGAAAGGAACTCAAGTC 600 
ArgAspGlyValMetGluAsnGluProLysProCysAlaAlaValLysGlyThrGlnVal 

601 ATGGTTGAAAATCTATTTTACAACATGGTAGCCCGCAAGAAAACATTGCAGAACTCCAAT 660 
MetValGluAsnLeuPheTyrAsnMetValAlaArgLysLysThrLeuGlnAsnSerAsn 

661 GATGACTACCCCAAGATCGTAGACTTCATCAGTCGGTTTGCAGTCCATCACATCAACGTT 720 
AspAspTyrProLysIleValAspPheIleSerArgPheAlaValHisHisIleAsnVal 

721 ACCTTCTCTTGCAGAAAGCATGGAGCCAATAGAGCAGATGTTCATAGTGCAAGTACATCC 780 
ThrPheSerCysArgLysHisGlyAlaAsnArgAlaAspValHisSerAlaSerThrSer 

781 TCAAGGTTAGATGCTATCAGGAGTGTCTATGGGGCTTCTGTCGTTCGTGATCTCATAGAA 840 
SerArgLeuAspAlaIleArgSerValTyrGlyAlaSerValValArgAspLeuIleGlu 

841 ATAAAGGTTTCATATGAGGATGCTGCAGATTCAATCTTCAAGATGGATGGTTACATCTCA 900 
IleLysValSerTyrGluAspAlaAlaAspSerIlePheLysMetAspGlyTyrIleSer 

901 AATGCAAATTATGTGGCAAAGAAGATTACAATGATTCTTTTCATAAATGATAGGCTTGTA 960 
AsnAlaAsnTyrValAlaLysLysIleThrMetIleLeuPheIleAsnAspArgLeuVal 

961 GACTGTACTGCTTTGAAAAGAGCTATTGAATTTGTGTACTCTGCAACATTGCCTCAAGCA 1020 
AspCysThrAlaLeuLysArgAlaIleGluPheValTyrSerAlaThrLeuProGlnAla 

1021 TCCAAACCTTTCATATACATGTCCATACATCTTCCATCAGAACACGTGGATGTTAATATA 1080 
SerLysProPheIleTyrMetSerIleHisLeuProSerGluHisValAspValAsnIle 

1081 CACCCAACCAAGAAAGAGGTTAGCCTTTTGAATCAAGAGCGTATTATTGAAACAATAAGA 1140 
HisProThrLysLysGluValSerLeuLeuAsnGlnGluArgIleIleGluThrIleArg 

1141 AATGCTATTGAGGAAAAACTGATGAATTCTAATACAACCAGGATATTCCAAACTCAGGCA 1200 
AsnAlaIleGluGluLysLeuMetAsnSerAsnThrThrArgIlePheGlnThrGlnAla 

1201 TTAAACTTATCAGGGATTGCTCAAGCTAACCCACAAAAGGATAAGGTTTCTGAGGCCAGT 1260 
LeuAsnLeuSerGlyIleAlaGlnAlaAsnProGlnLysAspLysValSerGluAlaSer 

1261 ATGGGTTCTGGAACAAAATCTCAAAAAATTCCTGTGAGCCAAATGGTCAGAACAGATCCA 1320 
MetGlySerGlyThrLysSerGlnLysIleProValSerGlnMetValArgThrAspPro 

1321 CGCAATCCATCTGGAAGATTGCACACCTACTGGCACGGGCAATCTTCAAATCTTGAAAAG 1380 
ArgAsnProSerGlyArgLeuHisThrTyrTrpHisGlyGlnSerSerAsnLeuGluLys 

1381 AAATTTGATCTTGTATCTGTAAGAAATGTTGTAAGATCAAGGAGAAACCAAAAAGATGCT 1440 
LysPheAspLeuValSerValArgAsnValValArgSerArgArgAsnGlnLysAspAla 

1441 GGTGATTTGTCAAGCCGTCATGAGCTCCTTGTGGAAATAGATTCTAGCTTCCATCCTGGC 1500 
GlyAspLeuSerSerArgHisGluLeuLeuValGluIleAspSerSerPheHisProGly 

1501 CTTTTGGACATTGTCAAGAACTGCACATATGTTGGACTTGCCGATGAAGCCTTTGCTTTG 1560 
LeuLeuAspIleValLysAsnCysThrTyrValGlyLeuAlaAspGluAlaPheAlaLeu 

1561 ATACAACACAATACCCGCTTATACCTTGTAAATGTGGTAAATATTAGTAAAGAACTTATG 1620 
IleGlnHisAsnThrArgLeuTyrLeuValAsnValValAsnIleSerLysGluLeuMet 

1621 TACCAGCAAGCTTTGTGCCGTTTTGGGAACTTCAATGCTATTCAGCTCAGTGAACCAGCT 1680 
TyrGlnGlnAlaLeuCysArgPheGlyAsnPheAsnAlaIleGlnLeuSerGluProAla 

1681 CCACTTCAGGAGTTGCTGGTGATGGCACTGAAAGACGATGAATTGATGAGTGATGAAAAG 1740 
ProLeuGlnGluLeuLeuValMetAlaLeuLysAspAspGluLeuMetSerAspGluLys 

1741 GATGATGAGAAACTGGAGATTGCAGAAGTAAACACTGAGATACTAAAAGAAAATGCTGAG 1800 
AspAspGluLysLeuGluIleAlaGluValAsnThrGluIleLeuLysGluAsnAlaGlu 

1801 ATGATTAATGAGTACTTTTCTATTCACATTGATCAAGATGGCAAATTGACAAGACTTCCT 1860 
MetIleAsnGluTyrPheSerIleHisIleAspGlnAspGlyLysLeuThrArgLeuPro 

1861 GTTGTACTGGACCAGTACACCCCTGATATGGACCGTCTTCCAGAATTTGTGTTGGCTTTA 1920 
ValValLeuAspGlnTyrThrProAspMetAspArgLeuProGluPheValLeuAlaLeu 

1921 GGAAATGATGTTACTTGGGATGACGAGAAAGAGTGCTTCAGAACAGTAGCTTCTGCTGTA 1980 
GlyAsnAspValThrTrpAspAspGluLysGluCysPheArgThrValAlaSerAlaVal 

1981 GGAAACTTCTATGCACTTCATCCCCCAATCCTTCCAAATCCATCTGGGAATGGCATTCAT 2040 
GlyAsnPheTyrAlaLeuHisProProIleLeuProAsnProSerGlyAsnGlyIleHis 

2041 TTATACAAGAAAAATAGAGATTCAATGGCTGATGAACATGCTGAGAATGATCTAATATCA 2100 
LeuTyrLysLysAsnArgAspSerMetAlaAspGluHisAlaGluAsnAspLeuIleSer 

2101 GATGAAAATGACGTTGATCAAGAACTTCTTGCGGAAGCAGAAGCAGCATGGGCCCAACGT 2160 
AspGluAsnAspValAspGlnGluLeuLeuAlaGluAlaGluAlaAlaTrpAlaGlnArg 



2161 GAGTGGACCATTCAGCATGTCTTGTTTCCATCCATGCGACTTTTCCTCAAGCCCCCGAAG 2220 
GluTrpThrIleGlnHisValLeuPheProSerMetArgLeuPheLeuLysProProLys 



2221 TCAATGGCAACAGATGGAACGTTTGTGCAGGTTGCTTCCTTGGAGAAACTCTACAAGATT 2280 
SerMetAlaThrAspGlyThrPheValGlnValAlaSerLeuGluLysLeuTyrLysIle 



2281 TTTGAAAGGTGTTAGCTCATAAGTGAGAAAATGAAGGCAGAGTAAGATCATGATTCATGG 2340 
PheGluArgCysEnd 



2341 AGTGTTTTTGAAAATGTGTATAATTTCACCGTATTATGTACTTTGATAGTGTCTGTAGAA 2400 
2401 ACTGAAGAAAGAAAGATGGCTTTACTTCTGAATTGAAAGTTAACGATGCCAGCAATTGTA 2460 
2461 TATTCTGATCAACCAAAAAAAAAAAAAAAAAAAAAAAAAAA 2501 
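
The deduced amino acid sequence printed under the nucleotide sequence follows from reading the cDNA in triplets, starting at the ATG at position 121 and ending at the TAG stop codon at positions 2293 to 2295. The following is a minimal Python sketch of that translation using the standard genetic code. The function name `translate` and the compact way the codon table is built are illustrative assumptions, not part of the original disclosure; only the first 60 nucleotides of the reading frame and its last five codons are used as input.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon (hypothetical helper)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Positions 121-180 of the rice cDNA (start of the reading frame).
orf_start = "ATGGACGAGCCTTCGCCGCGCGGAGGTGGGTGCGCCGGGGAGCCGCCCCGCATCCGGAGG"
# Positions 2281-2295 (last four codons plus the stop codon).
orf_end = "TTTGAAAGGTGTTAG"

print(translate(orf_start))  # MDEPSPRGGGCAGEPPRIRR
print(translate(orf_end))    # FERC*
```

The two printed strings correspond to the first twenty residues (MetAspGlu... through ArgIleArgArg) and the last four residues plus the stop (PheGluArgCysEnd) of the translation shown in the figure.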

Amino Acid Sequence of Rice Homolog of MLH1 

  1 MDEPSPRGGG CAGEPPRIRR LEESVVNRIA AGEVIQRPSS AVKELIENSL 
 51 DAGASSVSVA VKDGGLKLIQ VSDDGHGIRF EDLAILCERH TTSKLSAYED 
101 LQTIKSMGFR GEALASMTYV GHVTVTTITE GQLHGYRVSY RDGVMENEPK 
151 PCAAVKGTQV MVENLFYNMV ARKKTLQNSN DDYPKIVDFI SRFAVHHINV 
201 TFSCRKHGAN RADVHSASTS SRLDAIRSVY GASVVRDLIE IKVSYEDAAD 
251 SIFKMDGYIS NANYVAKKIT MILFINDRLV DCTALKRAIE FVYSATLPQA 
301 SKPFIYMSIH LPSEHVDVNI HPTKKEVSLL NQERIIETIR NAIEEKLMNS 
351 NTTRIFQTQA LNLSGIAQAN PQKDKVSEAS MGSGTKSQKI PVSQMVRTDP 
401 RNPSGRLHTY WHGQSSNLEK KFDLVSVRNV VRSRRNQKDA GDLSSRHELL 
451 VEIDSSFHPG LLDIVKNCTY VGLADEAFAL IQHNTRLYLV NVVNISKELM 
501 YQQALCRFGN FNAIQLSEPA PLQELLVMAL KDDELMSDEK DDEKLEIAEV 
551 NTEILKENAE MINEYFSIHI DQDGKLTRLP VVLDQYTPDM DRLPEFVLAL 
601 GNDVTWDDEK ECFRTVASAV GNFYALHPPI LPNPSGNGIH LYKKNRDSMA 
651 DEHAENDLIS DENDVDQELL AEAEAAWAQR EWTIQHVLFP SMRLFLKPPK 
701 SMATDGTFVQ VASLEKLYKI FERC* 







mutL/PMS1 signature sequence is shown in bold. 
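
The single-letter listing uses the standard one-letter amino acid codes, whereas the translation printed under the nucleotide sequence uses three-letter codes with "End" for the stop codon. The sketch below shows one way to convert between the two notations so that the listings can be cross-checked. The helper name `three_to_one` is hypothetical, and the use of `*` for the stop follows the `FERC*` ending of the single-letter listing.

```python
# Three-letter to one-letter amino acid codes; "End" marks the stop codon.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "End": "*",
}


def three_to_one(residues: str) -> str:
    """Convert a run of three-letter codes (no separators) to one-letter codes."""
    if len(residues) % 3 != 0:
        raise ValueError("input length must be a multiple of three")
    return "".join(
        THREE_TO_ONE[residues[i:i + 3]] for i in range(0, len(residues), 3)
    )


# Residues 201-210 and the C-terminal end, taken from the translation.
print(three_to_one("ThrPheSerCysArgLysHisGlyAlaAsn"))  # TFSCRKHGAN
print(three_to_one("PheGluArgCysEnd"))                 # FERC*
```

An unknown code, such as one damaged during text extraction, raises a `KeyError`, which makes corrupted residues easy to locate.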
Amino Acid Sequence Comparison of Rice and Arabidopsis mutL Homologs 



  2 DEPSPRGGGCAGEPPRIRRLEESVVNRIAAGEVIQRPSSAVKELIENSLD 51 
 13 EEESPATTIVPREPPKIQRLEESVVNRIAAGEVIQRPVSAVKELVENSLD 62 

 52 AGASSVSVAVKDGGLKLIQVSDDGHGIRFEDLAILCERHTTSKLSAYEDL 101 
 63 ADSSSISVVVKDGGLKLIQVSDDGHGIRREDLPILCERHTTSKLTKFEDL 112 

102 QTIKSMGFRGEALASMTYVGHVTVTTITEGQLHGYRVSYRDGVMENEPKP 151 
113 FSLSSMGFRGEALASMTYVAHVTVTTITKGQIHGYRVSYRDGVMEHEPKA 162 

152 CAAVKGTQVMVENLFYNMVARKKTLQNSNDDYPKIVDFISRFAVHHINVT 201 
163 CAAVKGTQIMVENLFYNMIARRKTLQNSADDYGKIVDLLSRMAIHYNNVS 212 

202 FSCRKHGANRADVHSASTSSRLDAIRSVYGASVVRDLIEIKVSYEDAADS 251 
213 FSCRKHGAVKADVHSVVSPSRLDSIRSVYGVSVAKNLMKVEVSSCDSSGC 262 

252 IFKMDGYISNANYVAKKITMILFINDRLVDCTALKRAIEFVYSATLPQAS 301 
263 TFDMEGFISNSNYVAKKTILVLFINDRLVECSALKRAIEIVYAATLPKAS 312 

302 KPFIYMSIHLPSEHVDVNIHPTKKEVSLLNQERIIETIRNAIEEKLMNSN 351 
313 KPFVYMSINLPREHVDINIHPTKKEVSLLNQEIIIEMIQSEVEVKLRNAN 362 

352 TTRIFQTQALNLSGIAQANPQKDKVSEASMGSGTKSQKIPVSQMVRTDPR 401 
363 DTRTFQEQKVEYIQ.STLTSQKSDSPVSQKPSGQKTQKVPVNKMVRTDSS 411 

402 NPSGRLHTYWHGQSSNLEKKFDLVS.VRNVVRSRRNQKDAGDLSSRHELL 450 
412 DPAGRLHAFLQPKPQSLPDKVSSLSVVRSSVRQRRNPKETADLSSVQELI 461 

451 VEIDSSFHPGLLDIVKNCTYVGLADEAFALIQHNTRLYLVNVVNISKELM 500 
462 AGVDSCCHPGMLETVRNCTYVGMADDVFALVQYNTHLYIANVVNLSKELM 511 

501 YQQALCRFGNFNAIQLSEPAPLQELLVMALKDDEL..MSDEKDDEKLEIA 548 
512 YQQTLRRFAHFNAIQLSDPAPLSELILLALKEEDLDPGNDTKDDLKERIA 561 

549 EVNTEILKENAEMINEYFSIHIDQDGKLTRLPVVLDQYTPDMDRLPEFVL 598 
562 EMHTELLKEKAEMLEEYFSVHIDSSANLSRLPVILDQYTPDMDRVPEFLL 611 

599 ALGNDVTWDDEKECFRTVASAVGNFYALHPPILPNPSGNGIHLYKKNRDS 648 
612 CLGNDVEWEDEKSCFQGVSAAIGNFYAMHPPLLPNPSGDGIQFYSKRGES 661 

649 MADEHAENDLISDENDVDQELLAEAEAAWAQREWTIQHVLFPSMRLFLKP 698 
662 SQEKSDLEGNVDMEDNLDQDLLSDAENAWAQREWSIQHVLFPSMRLFLKP 711 

699 PKSMATDGTFVQVASLEKLYKIFERC 724 
712 PASMASNGTFVKVASLEKLYKIFERC 737 



Deduced amino acid sequences of Oryza sativa and Arabidopsis thaliana (Genbank ID, 
SP_PL:Q9ZRV4) were compared using the Bestfit program of GCG. 
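
As a rough illustration of how such a pairwise comparison can be summarized, the sketch below counts identical positions in an already aligned pair of sequences. It does not reimplement Bestfit, which computes a local alignment; it only scores a finished alignment. The function name `percent_identity` and the convention of skipping gap columns (written `.` in this figure) are assumptions made for the example, which uses the final alignment block (rice residues 699 to 724, Arabidopsis residues 712 to 737).

```python
def percent_identity(seq_a: str, seq_b: str, gap_chars: str = ".-") -> float:
    """Percent of identical residues over aligned, gap-free columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which neither sequence has a gap.
    columns = [
        (a, b)
        for a, b in zip(seq_a, seq_b)
        if a not in gap_chars and b not in gap_chars
    ]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


rice = "PKSMATDGTFVQVASLEKLYKIFERC"         # rice, residues 699-724
arabidopsis = "PASMASNGTFVKVASLEKLYKIFERC"  # Arabidopsis, residues 712-737

# 22 of the 26 aligned positions are identical.
print(f"{percent_identity(rice, arabidopsis):.1f}%")  # 84.6%
```

The same function can be applied to the full alignment by concatenating the blocks of each sequence, gaps included, before calling it.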



FIGURE 3B 






Comparison of cDNA sequences of MLH1 orthologs from A. thaliana and O. sativa 



 158 GGGAGCCGCCCCGCATCCGGAGGTTGGAGGAGTCGGTGGTGAACCGCATC 207 
  73 GAGAGCCACCGAAGATTCAACGCTTAGAAGAATCAGTAGTCAACCGTATC 122 

 208 GCGGCGGGGGAGGTGATCCAGCGGCCGTCGTCGGCGGTGAAGGAGCTCAT 257 
 123 GCAGCTGGTGAAGTAATCCAGCGTCCAGTTTCAGCTGTGAAAGAGCTCGT 172 

 258 CGAGAACAGCCTCGACGCTGGCGCCTCCAGCGTCTCCGTTGCGGTGAAGG 307 
 173 TGAGAACAGCCTCGACGCCGATTCAAGTTCCATAAGCGTCGTTGTCAAAG 222 

 308 ACGGTGGCCTCAAGCTCATCCAGGTCTCCGATGACGGCCATGGCATCAGG 357 
 223 ACGGTGGTTTGAAACTCATTCAAGTCTCCGACGACGGTCACGGTATTAGA 272 

 358 TTTGAGGATTTGGCAATATTGTGCGAAAGGCATACTACCTCAAAGTTATC 407 
 273 CGTGAAGACTTGCCGATACTATGCGAGAGACATACAACATCGAAGCTGAC 322 

 408 TGCATACGAGGATCTGCAGACCATAAAATCGATGGGGTTCAGAGGGGAGG 457 
 323 TAAGTTTGAGGATTTGTTCTCTCTGAGTTCAATGGGATTTAGAGGAGAGG 372 

 458 CTTTGGCTAGTATGACTTATGTTGGCCATGTTACCGTGACAACGATAACA 507 
 373 CATTAGCTAGTATGACCTATGTTGCTCATGTTACAGTGACTACTATTACT 422 

 508 GAAGGCCAATTGCACGGCTACAGGGTTTCTTACAGAGATGGTGTAATGGA 557 
 423 AAAGGCCAGATTCATGGTTATAGAGTGTCTTATAGAGATGGTGTCATGGA 472 

 558 GAATGAGCCTAAGCCTTGCGCTGCGGTGAAAGGAACTCAAGTCATGGTTG 607 
 473 GCATGAACCAAAGGCGTGTGCTGCTGTCAAAGGAACACAGATAATGGTGG 522 

 608 AAAATCTATTTTACAACATGGTAGCCCGCAAGAAAACATTGCAGAACTCC 657 
 523 AGAATTTGTTCTACAATATGATTGCTAGAAGGAAGACACTTCAAAATTCT 572 

 658 AATGATGACTACCCCAAGATCGTAGACTTCATCAGTCGGTTTGCAGTCCA 707 
 573 GCTGATGATTACGGGAAAATCGTGGATTTGCTGAGCCGGATGGCTATTCA 622 

 708 TCACATCAACGTTACCTTCTCTTGCAGAAAGCATGGAGCCAATAGAGCAG 757 
 623 TTACAATAATGTCAGCTTTTCTTGTCGAAAGCATGGAGCTGTTAAGGCTG 672 

 758 ATGTTCATAGTGCAAGTACATCCTCAAGGTTAGATGCTATCAGGAGTGTC 807 
 673 ATGTTCACTCAGTCGTGTCACCTTCAAGGCTTGATTCAATTAGGTCTGTA 722 

 808 TATGGGGCTTCTGTCGTTCGTGATCTCATAGAAATAAAGGTTTCATATGA 857 
 723 TATGGAGTATCAGTTGCAAAGAACTTGATGAAAGTAGAAGTTTCCTCCTG 772 

 858 GGATGCTGCAGATTCAATCTTCAAGATGGATGGTTACATCTCAAATGCAA 907 
 773 TGACTCCTCTGGTTGTACTTTTGATATGGAGGGTTTCATATCCAATTCTA 822 

 908 ATTATGTGGCAAAGAAGATTACAATGATTCTTTTCATAAATGATAGGCTT 957 
 823 ACTATGTTGCTAAGAAGACTATATTGGTGCTTTTCATTAATGATAGATTG 872 

 958 GTAGACTGTACTGCTTTGAAAAGAGCTATTGAATTTGTGTACTCTGCAAC 1007 
 873 GTGGAATGCTCTGCCTTAAAAAGAGCCATTGAAATTGTTTATGCTGCAAC 922 

1008 ATTGCCTCAAGCATCCAAACCTTTCATATACATGTCCATACATCTTCCAT 1057 
 923 ATTGCCAAAAGCATCAAAACCTTTTGTCTACATGTCAATCAATTTGCCAC 972 

1058 CAGAACACGTGGATGTTAATATACACCCAACCAAGAAAGAGGTTAGCCTT 1107 
 973 GGGAACATGTTGATATCAATATTCACCCAACAAAGAAAGAGGTTAGCCTT 1022 

1108 TTGAATCAAGAGCGTATTATTGAAACAATAAGAAATGCTATTGAGGAAAA 1157 
1023 CTAAACCAGGAAATCATTATTGAGATGATACAGTCAGAGGTTGAAGTAAA 1072 

1158 ACTGATGAATTCTAATACAACCAGGATATTCCAAACTCAGGCATTAAACT 1207 
1073 ACTGAGAAACGCAAATGATACTAGGACGTTTCAAGAGCAGAAAGT 1117 

1208 TATCAGGGATTGC . . TCAAGCT AACCCACAAAAG GATA 1243 
1118 GGAATACATTCAATCTACGTTAACATCTCAGAAAAGTGATTCTC 1161 

1244 AGGTTTCTGAGGCCAGTATGGGTTCTGGAACAAAATCTCAAAAAATTCCT 1293 
1162 CAGTTTCTCAG AAGCCTTCTGGACAAAAGACACAGAAAGTTCCT 1205 

1294 GTGAGCCAAATGGTCAGAACAGATCCACGCAATCCATCTGGAAGATTGCA 1343 
1206 GTGAACAAAATGGTGAGAACAGATTCATCAGATCCAGCTGGAAGGTTACA 1255 

1344 CACCTACTGGCACGGGCAATCTTCAAATCT...TGAAAAGAAATTTGATC 1390 
1256 TGCCTTTTTGCAACCCAAGCCACAAAGTCTCCCTGACAAGGTTTCTAGTT 1305 

1391 TTGTATCTGTAAGAAATGTTGTAAGATCAAGGAGAAACCAAAAAGATGCT 1440 
1306 TGAGTGTAGTAAGGTCTTCTGTAAGGCAAAGAAGAAACCCAAAGGAAACT 1355 

1441 GGTGATTTGTCAAGCCGTCATGAGCTCCTTGTGGAAATAGATTCTAGCTT 1490 
1356 GCTGATCTTTCTAGTGTCCAGGAACTTATTGCTGGAGTTGACAGCTGCTG 1405 

1491 CCATCCTGGCCTTTTGGACATTGTCAAGAACTGCACATATGTTGGACTTG 1540 
1406 CCATCCAGGTATGCTGGAGACTGTAAGGAATTGCACATATGTTGGAATGG 1455 

1541 CCGATGAAGCCTTTGCTTTGATACAACACAATACCCGCTTATACCTTGTA 1590 
1456 CAGATGATGTTTTTGCTTTAGTTCAGTATAACACCCATCTATATCTAGCA 1505 

1591 AATGTGGTAAATATTAGTAAAGAACTTATGTACCAGCAAGCTTTGTGCCG 1640 
1506 AATGTGGTGAATCTCAGCAAAGAGCTAATGTATCAGCAAACTCTTCGTCG 1555 

1641 TTTTGGGAACTTCAATGCTATTCAGCTCAGTGAACCAGCTCCACTTCAGG 1690 
1556 TTTTGCTCATTTTAACGCAATACAGCTTAGCGATCCAGCCCCTTTGTCAG 1605 

1691 AGTTGCTGGTGATGGCACTGAAAGACGATGA . ATTGAT GAGTGAT 1734 
1606 AGTTGATATTGTTGGCTCTGAAAGAGGAGGATCTAGATCCAGGAAATGAT 1655 

1735 GAAAAGGATGATGAGAAACTGGAGATTGCAGAAGTAAACACTGAGATACT 1784 
1656 ACAAAAGATGATCTGAAAGAAAGAATTGCTGAAATGAATACAGAACTCCT 1705 

1785 AAAAGAAAATGCTGAGATGATTAATGAGTACTTTTCTATTCACATTGATC 1834 
1706 CAAGGAAAAAGCAGAAATGTTAGAGGAGTATTTCAGCGTGCACATTGACT 1755 

1835 AAGATGGCAAATTGACAAGACTTCCTGTTGTACTGGACCAGTACACCCCT 1884 
1756 CCAGTGCAAATTTGTCAAGGCTTCCTGTGATACTCGACCAGTATACACCT 1805 

1885 GATATGGACCGTCTTCCAGAATTTGTGTTGGCTTTAGGAAATGATGTTAC 1934 
1806 GACATGGATCGTGTTCCTGAATTTTTACTATGCTTGGGAAATGATGTTGA 1855 

1935 TTGGGATGACGAGAAAGAGTGCTTCAGAACAGTAGCTTCTGCTGTAGGAA 1984 
1856 GTGGGAAGATGAGAAGAGTTGCTTTCAAGGAGTTTCTGCAGCTATTGGGA 1905 

1985 ACTTCTATGCACTTCATCCCCCAATCCTTCCAAATCCATCTGGGAATGGC 2034 
1906 ACTTTTACGCCATGCATCCTCCTCTTTTGCCAAACCCATCGGGTGACGGT 1955 

2035 ATTCATTTATACA AGAAAAATAGAGATTC 2063 
1956 ATTCAGTTCTATAGTAAGAGAGGTGAGAGCTCTCAGGAAAAGTCAGATTT 2005 

2064 AATGGCTGATGAACATGCTGAGAATGATCTAATATCAGATGAAAATGACG 2113 
2006 AGAGGGTAACGTCGATATGGAGGACAATC 2034 

2114 TTGATCAAGAACTTCTTGCGGAAGCAGAAGCAGCATGGGCCCAACGTGAG 2163 
2035 TTGACCAAGATCTTCTGTCAGATGCTGAAAACGCATGGGCACAACGTGAA 2084 

2164 TGGACCATTCAGCATGTCTTGTTTCCATCCATGCGACTTTTCCTCAAGCC 2213 
2085 TGGTCAATCCAACACGTGTTGTTTCCGTCAATGAGATTGTTCTTGAAGCC 2134 

2214 CCCGAAGTCAATGGCAACAGATGGAACGTTTGTGCAGGTTGCTTCCTTGG 2263 
2135 ACCAGCTTCCATGGCTTCAAATGGGACTTTTGTAAAGGTAGCATCCCTTG 2184 

2264 AGAAACTCTACAAGATTTTTGAAAGGTGTTAGCTCATA 2301 
2185 AAAAGCTGTACAAGATATTCGAACGATGCTAACTGAAA 2222 